The present invention relates to a method for detecting the elements
constituting a microorganism flora, at least some of the elements of which have an
operon in common, characterized in that the elements of said flora are identified by
studying the intergenic sequence of said operon.

The human digestive system harbors a considerable number of
microorganisms which constitute a microbial flora of extreme complexity.
Although these bacteria are distributed throughout the digestive tract, the colon
contains most of the flora, both from a quantitative and a qualitative point of view.
It is estimated that the colonic flora of an individual consists of 10^13 to 10^15
bacteria, mostly anaerobic bacteria, represented by at least 400 species belonging
to approximately 30 different genera. These bacteria colonize the various stages of
the colon in a relatively heterogeneous manner. A "fermentation flora" in the
cecum and also a "putrefaction flora" in the left colon are conventionally
described. Moreover, a resident flora is distinguished from a transient flora. This
resident flora is itself divided up into dominant flora and subdominant flora. The
dominant bacteria, essentially anaerobic bacteria, are mainly represented by the
Bacteroides genus, gram-negative bacilli, but also by the Bifidobacterium,
Lactobacillus or Clostridium genera, gram-positive bacilli. The subdominant flora
contains aero-anaerobic and microaerophilic bacteria. The presence of
enterobacteria and of streptococci is especially noted. Diet, infections, intake of
pre- and probiotics and also treatments with antibiotics are all liable to cause
drastic modifications in the composition of the colonic flora. Since these variations
have a direct impact on health, it is important to be aware of them and to
understand them, both in order to avoid them and in order to trigger them for
therapeutic purposes. To date, the study of colonic bacteria has made it possible to
characterize approximately 200 species. However, this type of investigation comes
up against many problems essentially associated with the laboriousness of the
techniques. The results obtained and also the conclusions which ensue therefrom
are still sketchy.

Bacterial identification is carried out according to various methods which
employ either the demonstration of specific phenotypical or biochemical
characteristics, or the use and recognition of specific heterologous regions present
on the genome.



The beginnings of an identification may be carried out by describing the
morphology of the organism studied and by searching, for example, for the
presence of endospores, of sheaths, of cysts, of buds, of fruiting bodies, etc. The
form of the colonies, the pigmentation, and the origin of the sample are also useful
pieces of information. Preliminary studies may also be carried out using dyes
specific for the capsule, the flagella, the granules, the wall, etc. However, a more
thorough and precise identification necessarily involves techniques comprising
isolation on selected media. Such media are developed or improved in order to
increase the specificity of the selection. However, it is difficult to completely
exclude the presence of possible contaminants which may then interfere in the
recognition tests subsequently employed.

These tests make use of the biochemical characteristics specific for a
species. Multitest systems exist. They are identification galleries which are
provided in the form of kits, of microplates and of strips, the use of which is
sometimes automated. However, there exists a certain degree of heterogeneity of
chromosomal or plasmid origin within many species. Thus, one or more
characteristics may be absent in the identification, and most of the time only the
probability of belonging to a species will be given. A similarity of 80% or more
with a reference bacterium will be considered as acceptable. Single-test systems
have also been developed. They use synthetic fluorescent substrates for revealing
the presence of an enzyme specific for a microorganism. They enable a rapid
analysis, but are limited in their use, since a specific test must be developed
for each species.

Immunoassays using poly- or monoclonal antibodies have also been
developed. Besides the obvious limitation inherent in polyclonal antibodies, these
assays are mainly used to characterize serotypes, but are rarely used to identify
species. Although commonly used in hospital diagnosis, they remain relatively
unused in bacteriology. As regards the production of monoclonal antibodies, it
remains long, laborious and expensive work. They are, for example, directed
against lipopolysaccharides, the membrane, the pili, etc., but are, however, rarely
specific for a species.

The development of molecular biology techniques has made it possible to
develop novel identification assays. These techniques are based on hybridization
reactions or polymerase chain amplification reactions (PCR).

Genome/genome hybridization should be greater than or equal to 70% to
identify an unknown bacterial species as being the known reference bacterium
whose genomic DNA is used to perform the diagnosis. Other assays involve
heterologous regions on the DNA specific for a species. The hybridization
technique can thus be used, which consists in depositing the product to be analyzed
on a nylon or nitrocellulose membrane and then in incubating with a specific
labeled probe (cold probe or hot probe) {1}.

It is also possible to use specific primers making it possible to amplify a
fragment of a given size by the PCR technique {2, 3}. In this case, the
amplification products obtained by PCR can themselves be analyzed by other
techniques, such as RFLP (restriction fragment length polymorphism) {4} or TGGE
(temperature gradient gel electrophoresis) {5}, refining the diagnosis.

Despite their large capacity for discrimination, these techniques remain
limited since they make it possible to analyze only one species at a time, or else
consist in isolating a mixture of species which cannot then be identified without
knowing exactly the pattern of analysis of the amplification products by a given
technique on a given biotope.

The development of DNA chips (biochips) makes it possible to envision a
rapid diagnosis relating to several hundreds of species. This technique consists in
placing on a surface area of a few square millimeters several hundreds of DNA
sequences specific for a given organism. These probes are hybridized with DNA
fragments, generally obtained by RT-PCR. The possible hybridization of said
fragments is then observed, and indicates the presence or absence of the gene
expressed, or of the organism studied.



The nucleic acid targets studied over the last few years are essentially 16S
ribosomal RNA (with more than 7000 available sequences) {1, 2, 4, 5}, the region
separating the 16S and 23S genetic loci {3} and the elongation factors {6, 7}.
Thus, using 16S RNAs as a basis, it has been possible to detect several new
species, belonging to the Bacteroides and Clostridium groups, but not to the
Bifidobacterium group {8}. Moreover, and although 16S RNAs make it possible to
detect and to identify many bacteria down to the species, they are incapable of
discriminating between the various species of staphylococci {2, 3}. Thus, a
variable region of the gene encoding HSP 60 has been proposed for studying the
microorganisms of the intestinal flora {9}.

A subject of the present invention is a method for detecting and identifying
the elements constituting a microorganism flora, in particular the intestinal flora,
according to which a target which is even more discriminatory and universal than
those already studied is detected and studied.



The method according to the present invention comprises characterizing the
sequences of this target for the organisms present in the microbial flora studied,
and thus makes it possible to design a diagnostic test.

Thus, the target studied in the method of the invention exhibits strong
interspecies heterogeneity, which allows discrimination between the
microorganisms.

The present invention therefore relates to a method for detecting the
elements constituting a microorganism flora, at least some of the elements of
which have an operon in common, characterized in that:

a) the genomic DNA of said flora or the mRNAs is (are) prepared,

b) at least some of the noncoding intergenic sequences located in the operon
conserved in at least some of the elements of the flora are amplified, and

c) the various intergenic sequences amplified are identified in order to
determine the elements of said flora.
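
Steps b) and c) can be pictured in silico. The following is only a minimal
sketch under stated assumptions, not the procedure of the invention: the two
templates and the short primers GAYAAR and RTCNCC are made up for illustration
(they are not among the SEQ ID sequences), primer annealing is reduced to an exact
match against the IUPAC degenerate code, and the helper names primer_regex,
reverse_complement and amplify are hypothetical.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(
        base if len(IUPAC[base]) == 1 else "[" + IUPAC[base] + "]"
        for base in primer
    )


def reverse_complement(seq):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def amplify(template, forward, reverse):
    """Return the stretch lying between the two primer sites, or None."""
    fwd = re.search(primer_regex(forward), template)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so on the strand
    # given here we look for its reverse complement downstream of the
    # forward primer site.
    downstream = template[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    return downstream[:rev.start()]


# Made-up templates: conserved flanks, different spacers in between.
FORWARD, REVERSE = "GAYAAR", "RTCNCC"
ORGANISM_A = "TTGATAAGCCATTTACGGTGACAA"
ORGANISM_B = "TTGACAAACGGGTTTGGCGATAA"

print(amplify(ORGANISM_A, FORWARD, REVERSE))  # CCATTTAC
print(amplify(ORGANISM_B, FORWARD, REVERSE))  # CGGGTTT
```

The same degenerate primer pair matches both toy templates because the flanking
coding stretches are conserved, while the stretch recovered between the primer
sites differs and can serve to tell the two organisms apart.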

Specifically, surprisingly, it has been noted that the intergenic regions, in
the operons conserved between various species, exhibit a certain heterogeneity,
whereas the coding regions which flank said regions in the 5' and/or in the 3'
position are generally very conserved. It is possible that this fact is due to a
relatively weak selection pressure on the noncoding regions in the course of
evolution {10}.

The amplification is preferably carried out by polymerase chain reaction
(PCR), but other methods (PCR-like) may be employed, using a pair of primers
having nucleotide sequences for implementing the method according to the
invention.

The term "PCR-like" is intended to denote all methods using direct or
indirect reproductions of nucleic acid sequences, or else in which the labeling
systems have been amplified; these techniques are, of course, known. In general, it
involves amplification of the DNA with a polymerase; when the sample of origin
is an RNA, a reverse transcription should be carried out beforehand. A very large
number of methods currently exist for this amplification, such as, for example, the
SDA (Strand Displacement Amplification) technique, the TAS (Transcription-based
Amplification System) technique, the 3SR (Self-Sustained Sequence
Replication) technique, the NASBA (Nucleic Acid Sequence Based Amplification)
technique, the TMA (Transcription Mediated Amplification) technique, the LCR
(Ligase Chain Reaction) technique, the RCR (Repair Chain Reaction) technique,
the CPR (Cycling Probe Reaction) technique, or the Q-beta-replicase amplification
technique. Some of these techniques have since been improved.

The analysis of the amplified sequences is advantageously carried out on a
DNA chip comprising sequences complementary to the sequences liable to be
amplified from the elements of said flora. Knowledge of the microorganisms
which may be present in the biological sample studied is therefore important in
order to choose the DNA chip to be analyzed. Thus, it is necessary for the DNA
chip to have, at its surface, probes specific for each of the organisms intended to be
studied. Such a DNA chip is also a subject of the invention.

Thus, the present invention relates most particularly to a DNA chip
comprising, at its surface, a plurality of oligonucleotides complementary to the
intergenic sequences of the various operons conserved between the species. The
term "DNA chip" is intended to mean a solid support to which are attached nucleic
acid fragments under conditions which allow hybridization thereof with the
complementary oligonucleotides, and detection of the hybrids thus formed. Thus,
the term DNA chip according to the invention also covers membranes such as
those used to perform Southern blotting.

The oligonucleotides attached to the DNA chip according to the present
invention are so attached by any conventional method known to those skilled in the
art, and are approximately 50 bases long. It is understood that the oligonucleotides
considered may also be shorter or longer. Thus, it is within the scope of those
skilled in the art to determine the length of the oligonucleotides attached to the
chip according to the invention, for each sequence.

Preferably, the oligonucleotides attached to the DNA chip are chosen such
that their sequence comprises a part of the hypervariable region identified
according to the present invention. The oligonucleotides attached to the chip
according to the invention may also contain sequences corresponding to the
sequences variable to a lesser degree, located at or close to the end of the operon
genes.

In a particular and preferred implementation of the invention, the DNA
chip according to the invention has a plurality (a number greater than or equal to 2,
preferably 3, more preferably 5, most preferably 10) of oligonucleotides greater
than 40 bases long. Preferably, said oligonucleotides comprise a fragment of at
least 20, preferably 40 or 50, more preferably 75, most preferably 100 consecutive
bases, of the sequences SEQ ID No. 63 to SEQ ID No. 138 and SEQ ID No. 140 to
SEQ ID No. 189, corresponding to the intergenic sequences of various species
(rpoBC for SEQ ID No. 63 to SEQ ID No. 138, GroESL for SEQ ID No. 140 to
SEQ ID No. 189).

Thus, demonstrating the possible hybridizations of the amplified sequences 
makes it possible to identify the elements present in the microbial flora studied. 
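
The read-out described here can be sketched as a lookup of amplified sequences
against a set of probes. This is a deliberately crude illustration, assuming that
"hybridization" means a perfect complementary match, that the probes are a few
bases long rather than the approximately 50 bases envisioned, and that the probe
and organism names are placeholders; none of the sequences below come from the
sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def detect(amplicons, probes):
    """Return the names of the probes that find a perfect complement
    in at least one amplified sequence, in alphabetical order."""
    hits = set()
    for name, probe in probes.items():
        target = reverse_complement(probe)
        if any(target in amplicon for amplicon in amplicons):
            hits.add(name)
    return sorted(hits)


# Hypothetical probes attached to the chip, one per organism.
PROBES = {
    "organism_A": "GTAAATGG",
    "organism_B": "AAACCCG",
    "organism_C": "CCGGTTAA",
}

# Hypothetical amplified sequences obtained from a flora sample.
AMPLICONS = [
    "GATAAGCCATTTACGGTGAC",
    "GACAAACGGGTTTGGCGAT",
]

print(detect(AMPLICONS, PROBES))  # ['organism_A', 'organism_B']
```

Real hybridization tolerates mismatches depending on temperature and salt
conditions, so an actual analysis would replace the exact substring test with a
scoring step.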


An operon which is particularly suitable for implementing the method
according to the invention is the bacterial rpoBC operon. This bacterial operon
contains coding sequences which are relatively homologous between genera. It is
therefore possible to determine degenerate primers for amplifying a region which
is heterologous between species and which corresponds to the transcribed
intergenic region (IGR). In bacteria, the rpoBC operon encodes the beta and beta
prime subunits of DNA-directed RNA polymerase, just like the homologous genes,
which may or may not be conserved in the form of an operon, in mitochondria and
other eukaryotic organelles (chloroplasts), and just like nuclear eukaryotic RNA
polymerase II (which synthesizes the messenger RNAs). The study of this operon
makes it possible to detect not only bacteria, but also eukaryotic
microorganisms (yeast, protozoa, or others).

The method according to the invention is thus carried out using degenerate
primers located in the coding sequences of the operons, in particular at least one
primer chosen from the sequences SEQ ID No. 1 to SEQ ID No. 31, themselves a
subject of the invention. The RNA polymerase proteins are in fact extremely
conserved across species, which makes it possible to find amino acid
sequences which align with one another, and thus to choose degenerate
oligonucleotides for amplifying the intergenic sequences.
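
A degenerate primer written in the IUPAC code stands for a mixture of
concrete oligonucleotides. As a small sketch (the helper names degeneracy and
expand are illustrative, and "degeneracy" is taken here to mean the number of
distinct oligonucleotides in the mixture), the size of that mixture can be
computed for two of the primers of the invention:

```python
from itertools import product
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides a degenerate primer represents."""
    bases = primer.replace(" ", "")
    return prod(len(IUPAC[base]) for base in bases)


def expand(primer):
    """List every concrete oligonucleotide (only sensible for short inputs)."""
    bases = primer.replace(" ", "")
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


SEQ_ID_1 = "GGNGAYAARY TNGCNGGNAG NCAYGG"
SEQ_ID_9 = "AAYGCNGAYT TYGAYGGNGA YCARAT"

print(degeneracy(SEQ_ID_1))  # 16384
print(degeneracy(SEQ_ID_9))  # 1024
print(expand("GAY"))         # ['GAC', 'GAT']
```

The count grows by a factor of 4 for every N and by 2 for every two-base code,
which is one reason a family of related primers can be preferable to a single,
more degenerate one.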

The pairs of primers described by the sequences: (a sequence chosen from
the sequences SEQ ID No. 1 to SEQ ID No. 8)/(a sequence chosen from the
sequences SEQ ID No. 9 to SEQ ID No. 11) are used to perform a first
amplification of the intergenic region, IGR, of the bacteria. A second, more
specific, amplification can then be carried out using pairs of primers which
hybridize within the first amplified region, and which are described by the
sequences: (a sequence chosen from the sequences SEQ ID No. 12 to SEQ ID
No. 15)/(a sequence chosen from the sequences SEQ ID No. 16 to SEQ ID No. 31).

SEQ ID No. 1  GGNGAYAARY TNGCNGGNAG NCAYGG
SEQ ID No. 2  GGNGAYAARY TNGCNGGNCG NCAYGG
SEQ ID No. 3  GGNGAYAARY TNGCNAAYAG NCAYGG
SEQ ID No. 4  GGNGAYAARY TNGCNAAYCG NCAYGG
SEQ ID No. 5  GGNGAYAARA TGGCNGGNMG NCAYGG
SEQ ID No. 6  GGNGAYAART TYGCNTCNMG NCAYGG
SEQ ID No. 7  GGNGAYAART TYGCNAGYMG NCAYGG
SEQ ID No. 8  GGNGAYAART TYGCNACNMG NCAYGG
SEQ ID No. 9  AAYGCNGAYT TYGAYGGNGA YCARAT
SEQ ID No. 10 AAYGCNGAYT TYGAYGGNCA RATGGC
SEQ ID No. 11 AAYGCNGAYT TYGAYGGNGA YGARAT
SEQ ID No. 12 GGNGGNCARM GNTTYGGNGA RATGGA
SEQ ID No. 13 GGNGGNCAYG GNTTYGGNGA RATGGA
SEQ ID No. 14 GGNGGNCARW SNTTYGGNGA RATGGA
SEQ ID No. 15 GGNGGNNTNM GNTTYGGNGA RATGGA
SEQ ID No. 16 GGNAARCGNG TNGAYTAYTC NGGNMG
SEQ ID No. 17 GGNAARCGNG TNGAYTAYAG NGGNMG
SEQ ID No. 18 GGNAARAGNG TNGAYTAYTC NGGNMG
SEQ ID No. 19 GGNAARAGNG TNGAYTAYAG NGGNMG
SEQ ID No. 20 GGNAARCGNG GNGAYTAYTC NGTNMG
SEQ ID No. 21 GGNAARCGNG GNGAYTAYAG NGTNMG
SEQ ID No. 22 GGNAARAGNG GNGAYTAYTC NGTNMG
SEQ ID No. 23 GGNAARAGNG GNGAYTAYAG NGTNMG
SEQ ID No. 24 GGNAARCGNG TNGAYTTYTC NGGNMG
SEQ ID No. 25 GGNAARCGNG TNGAYTTYAG NGGNMG
SEQ ID No. 26 GGNAARAGNG TNGAYTTYTC NGGNMG
SEQ ID No. 27 GGNAARAGNG TNGAYTTYAG NGGNMG
SEQ ID No. 28 GGNAARCGNG TNGAYTTYTC NGCNMG
SEQ ID No. 29 GGNAARCGNG TNGAYTTYAG NGCNMG
SEQ ID No. 30 GGNAARAGNG TNGAYTTYTC NGCNMG
SEQ ID No. 31 GGNAARAGNG TNGAYTTYAG NGCNMG



The pair of primers described by the sequences SEQ ID No. 53 and
SEQ ID No. 54 is used to amplify the intergenic region, IGR, of the bacteria.



FO  SEQ ID No. 53  GGNGGNCANN SNTTYGGNGA RATGGA
RP  SEQ ID No. 54  AAYGCNGAYT TYGAYGGNGA YSARAT

FO  SEQ ID No. 55  GGNGGNCARM GNTTYGGNGA RATGGA
    SEQ ID No. 56  GGNGGNCAYG GNTTYGGNGA RATGGA
    SEQ ID No. 57  GGNGGNCARW SNTTYGGNGA RATGGA
    SEQ ID No. 58  GGNGGNNTNM GNTTYGGNGA RATGGA
RP  SEQ ID No. 59  AAYGCNGAYT TYGAYGGNGA YCARAT
    SEQ ID No. 60  AAYGCNGAYT TYGAYGGNCA RATGGC
    SEQ ID No. 61  AAYGCNGAYT TYGAYGGNGA YGARAT

These primers were designed based on the study of the degeneracy of
conserved protein motifs corresponding to rpoB and/or encoded by the rpoB gene:

beta 2 i: coryneb/bif/actinom/camp/pseudom/salmon/esch/vibrio/clos/bact/hel/
citrob/prot/haf/yers/past/actinob/aer
SEQ ID No. 55 GGNGGNCARM GNTTYGGNGA RATGGA (8 deg)
beta 2 ii: bacillus
SEQ ID No. 56 GGNGGNCAYG GNTTYGGNGA RATGGA (7 deg)
beta 2 iii: helicobacter mustelae
SEQ ID No. 57 GGNGGNCARW SNTTYGGNGA RATGGA (8 deg)
beta 2 iv: archae (methano)
SEQ ID No. 58 GGNGGNNTNM GNTTYGGNGA RATGGA (9 deg)
FO: 2 i/ii/iii: GGNGGNCANN SNTTYGGNGA RATGGA (SEQ ID No. 53)

For the reverse sequences, determined based on the degeneracy of
conserved protein motifs corresponding to rpoC and/or encoded by the rpoC gene:

beta p 2 i: coryneb/bif/actinom/bac/camp/pseudom/salmon/esch/vibrio/clos/bact/
hel/citrob/prot/haf/yers/past/actinob/aer/staph/lactob/enteroc/lactoc
SEQ ID No. 59 AAYGCNGAYT TYGAYGGNGA YCARAT (8 deg)
beta p 2 ii: archae (methano)
SEQ ID No. 61 AAYGCNGAYT TYGAYGGNGA YGARAT (8 deg)
beta p 2 iii: streptoc
SEQ ID No. 60 AAYGCNGAYT TYGAYGGNCA RATGGC (7 deg)
RP: P 2 i/ii: AAYGCNGAYT TYGAYGGNGA YSARAT (SEQ ID No. 54)
« REVERSE » ATYTSRTCNC CRTCRAARTC NGCRTT (SEQ ID No. 62)
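
The FO and RP primers are labeled as merging variants i/ii/iii and i/ii
respectively. One way to check such a merge, sketched here with a hypothetical
helper named covers, is to verify position by position that every base allowed by
a variant is also allowed by the consensus:

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def covers(consensus, primer):
    """True if every oligonucleotide encoded by `primer` is also encoded
    by `consensus` (same length, base sets nested at each position)."""
    a = consensus.replace(" ", "")
    b = primer.replace(" ", "")
    if len(a) != len(b):
        return False
    return all(set(IUPAC[y]) <= set(IUPAC[x]) for x, y in zip(a, b))


FO = "GGNGGNCANN SNTTYGGNGA RATGGA"  # SEQ ID No. 53
RP = "AAYGCNGAYT TYGAYGGNGA YSARAT"  # SEQ ID No. 54

FORWARD_VARIANTS = {
    55: "GGNGGNCARM GNTTYGGNGA RATGGA",
    56: "GGNGGNCAYG GNTTYGGNGA RATGGA",
    57: "GGNGGNCARW SNTTYGGNGA RATGGA",
    58: "GGNGGNNTNM GNTTYGGNGA RATGGA",
}
REVERSE_VARIANTS = {
    59: "AAYGCNGAYT TYGAYGGNGA YCARAT",
    60: "AAYGCNGAYT TYGAYGGNCA RATGGC",
    61: "AAYGCNGAYT TYGAYGGNGA YGARAT",
}

for seq_id, primer in FORWARD_VARIANTS.items():
    print(seq_id, covers(FO, primer))
for seq_id, primer in REVERSE_VARIANTS.items():
    print(seq_id, covers(RP, primer))
```

Under this check FO covers SEQ ID No. 55 to 57 but not SEQ ID No. 58, and RP
covers SEQ ID No. 59 and 61 but not SEQ ID No. 60, which agrees with the
i/ii/iii and i/ii labels.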

These primers are also part of the invention. 
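
The sequence labeled « REVERSE » (SEQ ID No. 62) is the reverse complement of
the RP primer (SEQ ID No. 54). A short sketch, using an illustrative helper named
reverse_complement, shows how the reverse complement of a degenerate sequence is
obtained: each ambiguity code is replaced by the code for the complementary set
of bases (R and Y swap, K and M swap, S, W and N are unchanged) and the result is
read backwards.

```python
# Complement table extended to the IUPAC ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) sequence.
    Spaces used for grouping are ignored."""
    return seq.replace(" ", "").translate(COMPLEMENT)[::-1]


RP = "AAYGCNGAYT TYGAYGGNGA YSARAT"       # SEQ ID No. 54
REVERSE = "ATYTSRTCNC CRTCRAARTC NGCRTT"  # SEQ ID No. 62

print(reverse_complement(RP))  # ATYTSRTCNCCRTCRAARTCNGCRTT
print(reverse_complement(RP) == REVERSE.replace(" ", ""))  # True
```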

A subject of the invention is also the genomic sequences of
microorganisms which may be amplified by the primers according to the
invention, in particular the pairs of primers: (a sequence chosen from the
sequences SEQ ID No. 1 to SEQ ID No. 8)/(a sequence chosen from the sequences
SEQ ID No. 9 to SEQ ID No. 11), and the pairs of primers: (a sequence chosen
from the sequences SEQ ID No. 12 to SEQ ID No. 15)/(a sequence chosen from
the sequences SEQ ID No. 16 to SEQ ID No. 31). Amplification with pairs of
primers: (a sequence chosen from the sequences SEQ ID No. 53, SEQ ID No. 55
to SEQ ID No. 58)/(a sequence chosen from the sequences SEQ ID No. 54, SEQ
ID No. 59 to SEQ ID No. 61) is also envisioned.

Thus, a subject of the invention is also in particular a sequence from SEQ
ID No. 63 to SEQ ID No. 138, which correspond to the hypervariable intergenic
regions of the rpoB operon of various organisms. A subject of the invention is also
a fragment of a minimum of 20 bases, preferably 30 bases, more preferably 50
bases, even more preferably 75 bases, most preferably 100 bases of one of the
sequences SEQ ID No. 63 to SEQ ID No. 138, or the sequences complementary
thereto, it being possible for said fragment to be used to define organism-specific
primers, or for the identification of organisms, in particular by hybridization.
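
The length condition on such fragments can be expressed as a simple test. The
sketch below is illustrative only: the 40-base reference is made up (the
sequences SEQ ID No. 63 to SEQ ID No. 138 are not reproduced here), the helper
name is_fragment is hypothetical, and "complementary sequence" is assumed to
mean the opposite strand read 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_fragment(oligo, reference, min_len=20):
    """True if `oligo` is a run of at least `min_len` consecutive bases
    of `reference` or of its complementary strand."""
    if len(oligo) < min_len:
        return False
    opposite = reference.translate(COMPLEMENT)[::-1]
    return oligo in reference or oligo in opposite


# Made-up 40-base reference standing in for an intergenic sequence.
REFERENCE = "ATGCGTACGTTAGCCTAGGATCCGATTACGCTAGCTAGGA"

print(is_fragment(REFERENCE[5:30], REFERENCE))  # True: 25 consecutive bases
print(is_fragment(REFERENCE[5:20], REFERENCE))  # False: only 15 bases
```

Raising min_len to 30, 50, 75 or 100 gives the successively preferred
thresholds.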

Thus, the DNA chip according to the invention preferably has, at its
surface, a plurality of oligonucleotides (a minimum of two) comprising fragments
chosen from the fragments of the sequences SEQ ID No. 63 to SEQ ID No. 138
defined above, thus allowing the identification of microorganisms. The length of
these oligonucleotides can be determined by those skilled in the art, as a function
of the hybridization conditions which they intend to use. Oligonucleotides
approximately 50 bases long are thus envisioned.

Another operon which is particularly suitable for implementing the method
according to the invention is the bacterial GroESL operon. This bacterial operon is
bicistronic and contains coding sequences which are relatively homologous
between genera. It is therefore also possible to determine degenerate primers to
amplify a region which is heterologous between species and which corresponds to
the transcribed intergenic region (IGR). In bacteria, the GroESL operon encodes
the HSP10 and HSP60 proteins (heat shock proteins of 10 and 60 kDa
respectively), just like the homologous genes, which may or may not be conserved
in the form of an operon, in mitochondria and other eukaryotic organelles
(chloroplasts). The study of this operon makes it possible to detect not only
bacteria, but also eukaryotic microorganisms (yeasts, protozoa, or others).

The method according to the invention is thus carried out using degenerate
primers located in the coding sequences of the operons. The HSP proteins are in
fact extremely conserved across species, which makes it possible to find
amino acid sequences which align with one another, and thus to choose degenerate
oligonucleotides to amplify the intergenic, promoter or terminator sequences.

Preferably, the primers described by the sequences SEQ ID No. 32 and
SEQ ID No. 33 are used to amplify the intergenic region, IGR, of E. coli and of
Enterobacteriaceae.

ENT-BDEG:
CTGGAYGTKA ARRTNGGYGA YATYGT (SEQ ID No. 32)
ENT-ADEG:
ANNACNGTNG CRGTRGTGGT RCCGTC (SEQ ID No. 33)



Other degenerate primers can also be used to implement the protocol 
according to the invention, in particular any primer chosen from the sequences 
SEQ ID No. 34 to SEQ ID No. 52. 



UNI-ADEG 1:
GGNGAYGGNA CNACNACNGC NACNNT (SEQ ID No. 34)
UNI-ADEG 2:
GGNGAYGGNA CNACNACNTG NTCNNT (SEQ ID No. 35)
ENT-BNEW:
AANMTTCGTC CNYTRCANGA YCGNGT (SEQ ID No. 36)
CLO-BNEW2:
ATNARRCCAY TWGGWGAYMG NGTWGT (SEQ ID No. 37)
BIF-BNEW:
AARCCRCTCG AGGACMRNRT NSTSGT (SEQ ID No. 38)
UNI-A3:
GGNGAYGGNA CNAANACNGC NACNNT (SEQ ID No. 39)
BIF-BNEW2:
ATCAAGCCNC TMGRRGACMR SRTNST (SEQ ID No. 40)
HEL-BNEW:
NTNCANCCNT TNGGNGANAG NGTNTT (SEQ ID No. 41)
CAM-BNEW:
NTNCANCCNT TNGGNAANCG NGTNCT (SEQ ID No. 42)
BACT-BNEW:
NTNAANCCNT TNGCNGANCG NGTNCT (SEQ ID No. 43)
CHLA-BNEW:
NTNAANCCNT TNGGNGANAG NATNTT (SEQ ID No. 44)
MYCP-BNEW:
NTNAAACCNN TNGGNAANCG NGTNAT (SEQ ID No. 45)
STA-BNEW:
NTNAAACCNN TNGGNAANCG NGTNAT (SEQ ID No. 46)
LACC-BNEW:
TTGAAACCNT TAGNGRAYCG YGTRST (SEQ ID No. 47)
LACB-BNEW:
TTAMARCCAW TMGGNGATCG NGTNRT (SEQ ID No. 48)
CLO-BNEW3:
ATNANACCAN TNGGNGACAG NGTNGT (SEQ ID No. 49)
ENT-BNEW2:
NTNCGNCCNT TNCANGANCG NGTNAT (SEQ ID No. 50)
LEG-BNEW:
NTNCGNCCNT TNCANGANCG NGTNGT (SEQ ID No. 51)
AER-BNEW:
NTNCGNCCNC TNCANGANCG NGTNAT (SEQ ID No. 52)
LACB-BNEW2:
MARCCNNTNG GNGAYMGNGT NATNGT (SEQ ID No. 139)

These primers are also subjects of the present invention. Preferably, the
detection of a microorganism is carried out using a pair of primers SEQ ID No. 32/
SEQ ID No. 33, or (SEQ ID No. 34, SEQ ID No. 35 or SEQ ID No. 39)/(a
sequence chosen from the sequences SEQ ID No. 36 to SEQ ID No. 38 or SEQ ID
No. 40 to SEQ ID No. 52).

The sequences SEQ ID No. 36 to SEQ ID No. 38 and/or SEQ ID No. 40 to
SEQ ID No. 52 and/or SEQ ID No. 139, used in particular in amplification
reactions with sequences SEQ ID No. 34, SEQ ID No. 35 and/or SEQ ID No. 39,
make it possible, respectively, to detect the microorganisms and species listed
below. One or more pair(s) of sequences may be used in an amplification reaction.

Thus, the sequences according to the present invention make it possible in
particular to detect microorganisms of the following genera and families:
Lactococcus (SEQ ID No. 39), Bifidobacterium (SEQ ID No. 38 and/or 40),
Mycobacterium (SEQ ID No. 40), Helicobacter (SEQ ID No. 41), Campylobacter
(SEQ ID No. 42), Bacteroides (SEQ ID No. 43), Chlamydia (SEQ ID No. 44),
Mycoplasma (SEQ ID No. 45), Staphylococcus (SEQ ID No. 46), Lactococcus
and/or Streptococcus (SEQ ID No. 47), Lactobacillus and/or Bacillus (SEQ ID No.
48), Clostridium (SEQ ID No. 37 and/or 49), Enterobacteriaceae (SEQ ID No. 36
and/or 50), Pasteurella and/or Haemophilus (SEQ ID No. 50), Neisseria and/or
Legionella (SEQ ID No. 51), Aeromonas and/or Bordetella (SEQ ID No. 52),
Lactobacillus and/or Bacillus (SEQ ID No. 139).

The subject of the invention is also the genomic sequences of
microorganisms which can be amplified using the primers according to the
invention, in particular the pairs of primers SEQ ID No. 32/SEQ ID No. 33, and
(SEQ ID No. 34, SEQ ID No. 35 or SEQ ID No. 39)/(a sequence chosen from the
sequences SEQ ID No. 36 to SEQ ID No. 38, SEQ ID No. 40 to SEQ ID No. 52 or
SEQ ID No. 139).

Thus, a subject of the invention is also in particular a sequence from SEQ
ID No. 140 to SEQ ID No. 189, which correspond to the hypervariable intergenic
regions of the GroESL operon of various organisms. A subject of the invention is
also any fragment of a minimum of 20 bases, preferably 30 bases, more preferably
50 bases, even more preferably 75 bases, most preferably 100 bases of one of the
sequences SEQ ID No. 140 to SEQ ID No. 189, or the sequences complementary
thereto, it being possible for said fragment to be used to define organism-specific
primers, or for the identification of organisms, in particular by hybridization.

Thus, the DNA chip according to the invention preferably has, at its
surface, a plurality of oligonucleotides (a minimum of two) comprising fragments
chosen from the fragments of the sequences SEQ ID No. 140 to SEQ ID No. 189
defined above, thus allowing the identification of the microorganisms. The length
of these oligonucleotides can be determined by those skilled in the art, as a
function of the hybridization conditions which they intend to use.
Oligonucleotides approximately 50 bases long are thus envisioned.

Subjects of the present invention are also diagnostic kits for carrying out
the method according to the invention. These diagnostic kits contain degenerate
primers for amplifying one or more intergenic regions of an operon which is
conserved among species. They may also contain the reagents required for the
amplification reaction. Moreover, DNA representing positive or negative controls
may be included in the diagnostic kits according to the invention.

A diagnostic kit according to the invention also advantageously contains
the elements required for analyzing the amplified products. In particular, a
diagnostic kit according to the invention contains a DNA chip according to the
invention, which has, at its surface, the sequences corresponding to the various
microorganisms.

Depending on the species it is desired to detect, the diagnostic kit according
to the invention contains the appropriate pair of primers and analytical elements.
Furthermore, a kit according to the invention may also comprise instructions for
carrying out the method according to the invention.

A diagnostic kit according to the invention may also contain only a DNA
chip according to the invention, and optionally also instructions for carrying out
the analysis of fragments located in the intergenic region of operons which are
conserved between species, the preferred operons being GroESL and rpoBC.

The coupling of specific primers and probes thus allows rapid and precise
identification of the flora of a given individual. It is therefore possible to establish
the profile(s) of populations characteristic of healthy individuals. It is also possible
to establish the standard profiles for various pathological conditions.

The method according to the invention also provides the possibility of easy
monitoring of the evolution of the flora as a function of diet. More specifically, it
is possible to follow the effects, in the colon, of a particular food, such as a pre- or
probiotic, or of a medicinal treatment such as a treatment with antibiotics.
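
Once the flora has been profiled at two time points, such monitoring amounts to
comparing two sets of detected organisms. The sketch below assumes that a profile
is simply the set of genera detected; the helper name compare_profiles and the
two example profiles are hypothetical and do not represent measured data.

```python
def compare_profiles(before, after):
    """Summarize which populations appeared, disappeared or persisted
    between two flora profiles given as collections of names."""
    before, after = set(before), set(after)
    return {
        "appeared": sorted(after - before),
        "disappeared": sorted(before - after),
        "stable": sorted(before & after),
    }


# Hypothetical profiles, e.g. before and after an antibiotic treatment.
BEFORE = {"Bacteroides", "Bifidobacterium", "Lactobacillus"}
AFTER = {"Bacteroides", "Clostridium"}

print(compare_profiles(BEFORE, AFTER))
# {'appeared': ['Clostridium'],
#  'disappeared': ['Bifidobacterium', 'Lactobacillus'],
#  'stable': ['Bacteroides']}
```

A quantitative read-out, such as signal intensity per probe, would allow the
same comparison to track changes in abundance rather than presence alone.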

It is therefore possible to envision the development of foods or medicinal
products for "effect on the flora" purposes, allowing reestablishment and a return
to a normal profile after an imbalance subsequent to any pathological condition or
attack. It is also possible to use primers and probes corresponding to pathogenic
strains in order to optionally establish critical population thresholds preceding a
pathological condition. It is then possible to determine which of the other
populations are liable to exert a barrier effect on these pathogens.

The tools for diagnosing the intestinal flora developed and based on the
method according to the invention (which are also subjects of the invention) are of
interest firstly to industrial companies in the agrifood and pharmaceutical sectors,
in order to develop their products and to determine the impact thereof on the
intestinal flora. Specifically, particular diets are liable, in the long term, to
significantly modify the composition of the flora and, consequently, to have
harmful or beneficial effects depending on the types of population which appear or
disappear. Similarly, medicinal treatments, and in particular antibiotic treatments,
lead to imbalances in the microflora. Characterization of the populations affected
according to the type of medicinal product would make it possible to set up a
parallel or subsequent treatment capable of preventing these modifications or of
reestablishing a correct flora as rapidly as possible.

These tools are also of interest to health professionals, for characterizing
the intestinal flora of patients, which may make it possible, for example, to direct a
treatment. Specifically, gastroenterologists estimate that 70% of the population of
industrialized countries complain of diverse digestive disorders, which are called
functional colopathy, ranging from simple digestive disorders such as bloating or
flatulence, to more significant disorders such as constipation or diarrhea, etc. The
majority of their consultations concern this functional colopathy.

Few solutions are provided to treat these disorders since, for certain types
of colopathy, their cause is still quite unknown, and for others, there is no effective
treatment. Added to this is the problem of the medical diagnosis since the patients
presenting these symptoms of functional colopathy do not generally present any
physical lesion in the colon. Only a questionnaire enables the gastroenterologist to
turn toward a type of treatment, which proves to be relatively ineffective in the
majority of cases. The market for products which can relieve these disorders is
therefore considerable, as is that for diagnosis. Specifically, a diagnosis of the state
of the flora of the patients might provide the physician with information regarding
the causes of their disorders and the treatment to be carried out.

In order to select a genomic target of interest, namely a target which is
conserved in all the genomes, the conservation of the most conserved operons in
the course of evolution was studied based on the genomes of the 51 bacteria
entirely sequenced and available on the NCBI server. These sequences were
positioned relative to that of rpoB/C (beta operon).



It emerged from this first analysis that the longest and most conserved
targets are in fact the GroESL operons (encoding Hsp10 (groES) and Hsp60
(groEL)) and a part of the beta operon corresponding to the rpoB and rpoC genes
(encoding the beta and beta' subunits of DNA-directed RNA polymerase). In
addition, it was possible to identify conserved protein motifs sufficiently long to
allow the definition and then the synthesis of universal (ubiquitous), or almost
universal, degenerate primers.

Thus, these two operons were chosen in order to exemplify the principle of 
the method according to the invention. 



Finally, the region of interest of the beta operon, i.e. the region amplifiable
by PCR, was amplified using the two corresponding degenerate primers (FO
and RP: SEQ ID No. 53 and SEQ ID No. 54) for a selection of bacteria, in order to
establish its sequence and to test the amplified products by hybridization on a
nylon membrane so as to validate their specificity. These sequences were also
aligned to their homologs available on GenBank in order to observe this
specificity by bioinformatics.



The same experiments were carried out with the GroESL operon, and it can
thus be shown that the method according to the invention makes it possible to
identify and discriminate between the various species of microorganisms.

The following examples are intended to illustrate the invention, and should 
not be considered as limiting the invention. 

In the application, the abbreviations for the bacteria are as follows:

Bacillus subtilis (BS) CIP 52-65T; Bacteroides vulgatus (BV) DSM 1447;
Bifidobacterium longum (BL) DSM 20219; Clostridium leptum (CL) DSM 753;
Clostridium nexile (CN) DSM 1787; Clostridium spiroforme (CS) DSM 1552;
Clostridium glycolicum (CG) DSM 1288; Lactobacillus gasseri (LG) DSM 20077;
Lactobacillus helveticus (LH) CIP 103146; Lactobacillus paracasei (LP) DSM
8741; Lactobacillus reuteri (LR) DSM 20053; Pseudomonas aeruginosa (PA)
CIP 100720; Ruminococcus hydrogenotrophicus (RH) DSM 10507; Citrobacter
freundii (CF); Serratia liquefaciens (SL); Serratia marcescens (SM); Enterobacter
cloacae (EnC); Escherichia coli (EsC); Morganella morganii (MM); Proteus
mirabilis (PM); Klebsiella oxytoca (KO); Klebsiella pneumoniae (KP).

DESCRIPTION OF THE FIGURES 

Figure 1 : Diagram of the rpoBC operon of E. coli. The universal 
(ubiquitous) primers are used to amplify the intergenic sequence. 

Figure 2 : Diagram of the groESL operon of E. coli. The universal primers
are used to amplify the intergenic sequence.

Figure 3 : Principle of a DNA chip. Specific sequences are attached to a 
solid support. The possible hybridization of the complementary sequences makes it 
possible to determine their presence in a sample. 

Figure 4 : Hybridization of deposits of 10 ng of DNA amplified by PCR
with rpoBC primers (i) and of genomic DNA (ii), with a Serratia marcescens probe
(~ 0.25 ng/ml) (A) or a Klebsiella oxytoca probe (~ 1 ng/ml) (B) for 18 hours at
60°C and color development for 30 minutes at 37°C. Cross-hybridization of CF,
SM, SL, EC and KP with the KO-DIG probe (~ 1 ng/ml) and of SL with the
SM-DIG probe (~ 0.25 ng/ml) can be observed.

Figure 5 : Hybridization of deposits i (genomic DNA, a: 10 to 20 µg, b: 5 to
10 µg, c: 0.5 to 1 ng, d: 50 to 100 ng, e: 5 to 10 ng, f: 0.5 to 1 ng) and ii (DNA
amplified by PCR with GroESL primers, a: 50 to 100 ng, b: 5 to 10 ng, c: 0.5 to 1
ng, d: 50 to 100 pg, e: 5 to 10 pg, f: 0.5 to 1 pg) with a PA-DIG probe (~ 10 ng/ml)
for 18 hours at 42°C and color development for 30 min at 37°C.

Figure 6 : Hybridization of deposits i (DNA amplified by PCR with rpoBC
primers: 10 ng/1 ng/100 pg) and ii (genomic DNA: 1 µg/100 ng/10 ng) with an
LR-DIG probe (~ 1 ng/ml) for 18 hours at 50, 55, 60 and 65°C and color
development for 30 min at 37°C.

EXAMPLES

Example 1 : Isolation of strains 

In order to have a broad and representative sample of human colonic flora, 
it is necessary to isolate new bacterial strains, also called nonculturable (and 
therefore unknown) strains, which constitute a high percentage of the bacteria of 
the human colonic flora. 

In order to perform these isolations, a large quantity of human stools is
collected and sterilized by means of gamma-type radiation or by heat, for the
purpose of seeding therein samples of these same human stools and of culturing
them aerobically and anaerobically, in liquid and solid media.

Depending on the culture conditions used, it is thus possible to isolate new 
genera, species or strains of colonic bacteria or other eukaryotic microorganisms. 

Example 2 : Characterization of the sequences of the isolated strains
(rpoB)

This involves carrying out the molecular characterization of sequences, 
ideally of mRNA to perform a quantification, and if not of genomic DNA, of the 
isolates of bacterial or eukaryotic microorganisms. 

The sequences corresponding to portions of the bacterial rpoBC operon are 
studied. The genes of this operon are in fact relatively homologous between 
genera. 

A computer analysis (sequence alignment) thus makes it possible to define 
degenerate primers for amplifying the region which is heterologous between 
species and which corresponds to the transcribed intergenic region.

Thus, the primers SEQ ID No. 1 and SEQ ID No. 31, and also the other
primers, were defined after alignment of the sequences corresponding to more than
50 species of living organisms (prokaryotes and eukaryotes, not shown) using the
redundancy of the genetic code. The sequences SEQ ID No. 53 and SEQ ID No.
54 are in particular preferred.
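
The degeneracy that results from exploiting the redundancy of the genetic code can be quantified. The following is a minimal sketch and not part of the protocol described here: the primer string is a hypothetical illustration (it is not one of the SEQ ID sequences of this application), and the function names are assumptions. It counts and enumerates the concrete oligonucleotides represented by a primer written with the standard IUPAC ambiguity codes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer, for illustration only (not a sequence from this application).
example_primer = "ATGGARTAYGCN"
print(degeneracy(example_primer))  # 2 * 2 * 4 = 16
print(expand(example_primer)[:2])
```

A lower degeneracy generally means a larger share of the primer pool matches any given template, which is one reason to anchor primers on motifs encoded by few alternative codons.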

The regions amplified by PCR or RT-PCR with the abovementioned 
primers can obviously be cloned into various vectors, in order to be used to refine 
the analysis (in particular in order to sequence them). 

Example 3 : Characterization of the sequences of the isolated strains 
(GroESL) 

The sequences corresponding to portions of the bicistronic bacterial 
GroESL operon are studied. The genes of this operon are in fact relatively 
homologous between genera. 

Computer analysis (sequence alignment) thus makes it possible to define 
degenerate primers for amplifying the region which is heterologous between 
species and which corresponds to the transcribed intergenic region. 
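
As an illustration of how a sequence alignment can reveal conserved flanks suitable for primers around a variable region, here is a minimal sketch. The toy alignment, the function names and the thresholds are all assumptions made for the example; they are not data or software from this application, and gap characters are treated like any other character.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common character in each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def conserved_runs(scores, threshold=1.0, min_length=4):
    """Return half-open (start, end) intervals where every column meets the threshold."""
    runs, start = [], None
    for i, score in enumerate(scores + [0.0]):  # sentinel closes a trailing run
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment (hypothetical sequences): conserved flanks, variable middle.
toy_alignment = [
    "ATGGCTAAACCGTTGACC",
    "ATGGCTTCACGGTTGACC",
    "ATGGCTGAGTCGTTGACC",
]
scores = column_conservation(toy_alignment)
print(conserved_runs(scores))  # [(0, 6), (11, 18)]
```

In this toy case the two conserved runs bracket columns 6 to 10, which play the role of the region that is heterologous between species.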

Thus, the primers SEQ ID No. 34 and SEQ ID No. 35 were defined after
alignment of the sequences corresponding to more than 100 species of living
organisms (prokaryotes and eukaryotes, not shown).

The sequences SEQ ID No. 36 to SEQ ID No. 52, and in particular SEQ ID 
No. 139, correspond to complementary sequences which can be used to amplify 
microorganisms of diverse genera and/or families. 

As regards the primers SEQ ID No. 32 and SEQ ID No. 33, they were
defined based on the conserved sequences of the GroES and GroEL genes of E.
coli, using the degeneracy of the genetic code.

Example 4 : Amplification reactions (GroESL)

The PCR reactions are carried out according to the following protocol:

2 ml of culture broth shaken at 37°C for 18 h are concentrated by
centrifugation and resuspension of the bacterial pellet in 30 µl of distilled water,
and then a 1/10 dilution of this concentrate, treated at 100°C for 10 minutes, is
used as a template for the PCR reactions. The reaction conditions are 94°C/5 min,
then 25 cycles of (94°C/30 sec, 60°C/45 sec, 72°C/30 sec), followed by an
elongation step at 72°C for 7 min.

Analysis of the amplification products shows that it is possible to
amplify, using the primers SEQ ID No. 32 and SEQ ID No. 33, the intergenic
region of various enterobacteria, such as Escherichia coli, Enterobacter cloacae,
Morganella morganii, Serratia liquefaciens, Proteus mirabilis, Serratia
marcescens, Klebsiella pneumoniae, Citrobacter freundii or Klebsiella oxytoca.
The amplified region varies in length, according to the species, from 400 to 500
base pairs (bp). Use of the pair SEQ ID No. 34 and SEQ ID No. 36 gives
amplification products of between 550 and 650 bp in length.

Use of the pairs: (SEQ ID No. 34, SEQ ID No. 35 or SEQ ID No. 39)/(a
sequence chosen from the sequences SEQ ID No. 36 to SEQ ID No. 38 or SEQ ID
No. 40 to SEQ ID No. 52, or SEQ ID No. 139) makes it possible to amplify
sequences specific to certain families and species, and to identify the organisms of
these families or species.
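
The number of primer pairs implied by this combination rule can be enumerated directly. The sketch below is only an illustration: the SEQ ID numbers come from the text, while the variable names and the reading of the two groups as the two members of each pair are assumptions.

```python
from itertools import product

# SEQ ID numbers as listed in the text; the group names are hypothetical labels.
first_group = [34, 35, 39]
second_group = list(range(36, 39)) + list(range(40, 53)) + [139]

# Every pairing of one primer from the first group with one from the second.
pairs = list(product(first_group, second_group))

print(len(second_group))  # 3 + 13 + 1 = 17
print(len(pairs))         # 3 * 17 = 51 candidate pairs
for a, b in pairs[:3]:
    print(f"SEQ ID No. {a} / SEQ ID No. {b}")
```

Which of these candidate pairs actually amplify a given family or species is an experimental question that the enumeration does not answer.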

For the amplification reactions, use is preferably made of a primer marked
"A" with a primer marked "B".

The regions amplified by PCR or RT-PCR with the abovementioned 
primers can obviously be cloned into various vectors, in order to be used to refine 
the analysis (in particular in order to sequence them). 



PCR PROTOCOL

In order to show that using the intergenic region of the two operons of
interest as a nucleic acid probe can make it possible to discriminate several
bacterial species, said intergenic region (IGR) for each target was amplified by
direct PCR on bacterial suspensions. For the amplification reactions, use is
preferably made of a primer marked "A" with a primer marked "B".



2 ml of culture broth shaken at 37°C for 18 h are concentrated by
centrifugation and resuspension of the bacterial pellet in 30 µl of distilled water,
and then a 1/10 dilution of this concentrate, treated at 100°C for 10 minutes, is
used as a template for the PCR reactions.
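
The net effect of this preparation on cell density can be worked out with simple arithmetic. The sketch below is illustrative only and assumes that all cells from the 2 ml culture are recovered in the 30 µl resuspension, which the text does not state; the function name is hypothetical.

```python
def net_enrichment(culture_volume_ul: float,
                   resuspension_volume_ul: float,
                   dilution_factor: float) -> float:
    """Cell density of the PCR template relative to the original culture.

    Assumes complete recovery of cells in the pellet (an idealization).
    """
    concentration_factor = culture_volume_ul / resuspension_volume_ul
    return concentration_factor / dilution_factor


# 2 ml of broth -> 30 µl of water, then a 1/10 dilution.
factor = net_enrichment(2000.0, 30.0, 10.0)
print(f"{factor:.2f}x")  # about 6.67 times the density of the starting culture
```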



groESL OPERON: 



The PCR reactions for this target are carried out at a Tm ranging between
59°C and 60°C. The reaction conditions are 94°C/5 min, then 25 cycles of
(94°C/30 sec, 60°C/45 sec, 72°C/30 sec), followed by an elongation step at 72°C
for 7 min. The amplified intergenic regions are then observed by agarose gel
electrophoresis using a 1 Kb + ladder (Gibco BRL).

Analysis of the amplification products shows that it is possible to
amplify, using the primers SEQ ID No. 32 and SEQ ID No. 33, the intergenic
region of various enterobacteria, such as Escherichia coli, Enterobacter cloacae,
Morganella morganii, Serratia liquefaciens, Proteus mirabilis, Serratia
marcescens, Klebsiella pneumoniae, Citrobacter freundii or Klebsiella oxytoca.
The amplified region varies in length, according to species, from 400 to 500 base
pairs (bp). Use of the pair SEQ ID No. 34 and SEQ ID No. 36 gives amplification
products of between 550 and 650 bp in length.

rpoB/C OPERON : 


The PCR reactions for this target are carried out at a Tm ranging between
63°C and 64°C. The reaction conditions are 94°C/4 min, then 30 cycles of
(94°C/30 sec, 64°C/30 sec, 72°C/3 min), followed by an elongation step at 72°C
for 12 min. The amplified intergenic regions are then observed by agarose gel
electrophoresis using a molecular weight marker III DNA ladder (ref: No. 528552;
Boehringer Mannheim).
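
To compare the two cycling programs at a glance, they can be written down as data. The following sketch is only an illustration: the times and cycle counts are those given in the text, whereas the class and field names are assumptions, and the totals count programmed hold times only (temperature ramping between steps is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    name: str
    initial_denaturation_s: int
    cycles: int
    cycle_steps_s: tuple[int, ...]  # denaturation, annealing, extension
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Sum of all programmed holds, excluding ramp times."""
        return (self.initial_denaturation_s
                + self.cycles * sum(self.cycle_steps_s)
                + self.final_extension_s)


# Conditions as stated in the text for each target.
GROESL = CyclingProgram("groESL", 5 * 60, 25, (30, 45, 30), 7 * 60)
RPOBC = CyclingProgram("rpoB/C", 4 * 60, 30, (30, 30, 3 * 60), 12 * 60)

for program in (GROESL, RPOBC):
    print(f"{program.name}: {program.hold_time_s() / 60:.2f} min")
# groESL: 55.75 min
# rpoB/C: 136.00 min
```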

Analysis of the amplification products shows that it is possible to
amplify, using the pair of primers SEQ ID No. 53 and SEQ ID No. 54, the
intergenic region of various bacteria, such as Escherichia coli, Clostridium
leptum, Klebsiella oxytoca, Lactococcus lactis, Citrobacter freundii, Serratia
marcescens, Proteus mirabilis, Serratia liquefaciens, Morganella morganii,
Enterobacter cloacae or Ruminococcus hydrogenotrophicus.

DNA fragments corresponding to the intergenic regions of the rpoB/C 
operon in various species were reamplified and analyzed using bands extracted 
from an agarose gel preparation. These fragments were prepurified with a Qiagen 
extraction kit. 


The regions amplified by PCR or RT-PCR with the abovementioned 
primers can obviously be cloned into various vectors, in order to be used to refine 
the analysis (in particular in order to sequence them). 

HYBRIDIZATION PROTOCOL

With a view to testing the specificity of the PCR products, for the species
selected for our study, deposits of these DNAs were made on a nylon membrane
according to a sodium hydroxide (NaOH) fixation protocol. The DNA
concentrations for these deposits are given on the corresponding figures. These
membranes were hybridized according to the protocol of the PCR DIG Probe
Synthesis Kit (Roche) Cat. No. 1636090. The concentration of the probe used,
synthesized according to the same protocol, is also indicated on the figures, as is
the temperature of hybridization carried out overnight (18 h). The temperature of
pre-hybridization is 65°C for each experiment, and it lasts 45 min.

Detection of this type of hybridization with this type of labeling (DIG) is
termed colorimetric ("cold" labeling, as opposed to radioactive labeling).

Figures 4 to 6 show that detection is specific to the organism tested,
although some cross-hybridization reactions may occur. These reactions may be
reduced by choosing probes which are shorter and located among the
hypervariable intergenic sequences, as defined by SEQ ID No. 63 to SEQ ID No.
138 (rpoN) or SEQ ID No. 40 to SEQ ID No. 189 (GroESL).

Thus, a DNA chip with various probes located in the intergenic region will
make it possible to determine unambiguously the presence or absence of a
microorganism, even when there is cross-hybridization. Specifically, the presence
of a microorganism will be deduced from the hybridization for each of the probes.
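
One way to picture how several probes per microorganism can override an isolated cross-hybridization signal is the sketch below. It is a simplified illustration under stated assumptions: the probe identifiers are hypothetical, the signals are invented for the example, and the decision rule (a microorganism is called present only if a minimum fraction of its probes hybridize, here all of them) is an assumption rather than a rule specified in this application.

```python
def call_presence(probe_map, signals, min_fraction=1.0):
    """Call each organism present if enough of its probes gave a signal.

    probe_map: organism -> list of probe identifiers on the chip
    signals:   set of probe identifiers that hybridized
    """
    calls = {}
    for organism, probes in probe_map.items():
        positive = sum(1 for probe in probes if probe in signals)
        calls[organism] = positive / len(probes) >= min_fraction
    return calls


# Hypothetical chip layout: three intergenic probes per organism.
probe_map = {
    "SM": ["SM_p1", "SM_p2", "SM_p3"],
    "SL": ["SL_p1", "SL_p2", "SL_p3"],
}
# Invented readout: SL_p1 lights up through cross-hybridization with SM DNA.
signals = {"SM_p1", "SM_p2", "SM_p3", "SL_p1"}

print(call_presence(probe_map, signals))  # {'SM': True, 'SL': False}
```

With a single probe per organism, the SL_p1 signal alone would have produced a false positive for SL; requiring agreement across probes is what removes the ambiguity in this toy case.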

It may therefore be advantageous to define DNA chips having specific
probes corresponding to the intergenic region of each microorganism, but also to
include several different probes for each microorganism.